Identification and Classification of Proper Nouns in Chinese Texts

Authors

  • Hsin-Hsi Chen
  • Jen-Chang Lee
Abstract

Various strategies are proposed to identify and classify three types of proper nouns in Chinese texts: Chinese personal names, transliterated personal names and organization names. Clues from the character, sentence and paragraph levels are employed to resolve Chinese personal names. Character, Syllable and Frequency Conditions are presented to treat transliterated personal names. To deal with organization names, keywords, prefix, word association and parts of speech are applied. For fair evaluation, large-scale test data are selected from six sections of a newspaper. The precision and recall for these three types are (88.04%, 92.56%), (50.62%, 71.93%) and (61.79%, 54.50%), respectively. When the former two types are regarded as a single category, the performance becomes (81.46%, 91.22%). Compared with other approaches, our approach has better performance and our classification is automatic.
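
The (precision, recall) pairs quoted in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Counts` type, the `merge` helper, its `cross` parameter and all the counts are hypothetical and are not taken from the paper, whose exact scoring procedure is not described in this abstract.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    correct: int   # names the system output that match the reference annotation
    proposed: int  # all names the system output for this type
    gold: int      # all names of this type in the reference annotation


def precision(c: Counts) -> float:
    """Fraction of proposed names that are correct."""
    return c.correct / c.proposed if c.proposed else 0.0


def recall(c: Counts) -> float:
    """Fraction of reference names that the system found."""
    return c.correct / c.gold if c.gold else 0.0


def merge(a: Counts, b: Counts, cross: int = 0) -> Counts:
    """Pool two types into one category.

    `cross` is the (hypothetical) number of names assigned to the wrong
    one of the two types; such names become correct once the types merge.
    """
    return Counts(
        correct=a.correct + b.correct + cross,
        proposed=a.proposed + b.proposed,
        gold=a.gold + b.gold,
    )


# Hypothetical counts, NOT the paper's data.
chinese_names = Counts(correct=80, proposed=100, gold=90)
transliterated = Counts(correct=30, proposed=60, gold=40)

for label, c in [
    ("Chinese personal names", chinese_names),
    ("transliterated names", transliterated),
    ("merged category", merge(chinese_names, transliterated, cross=10)),
]:
    print(f"{label}: precision={precision(c):.2%}, recall={recall(c):.2%}")
```

Under these assumptions, pooling two types sums their counts, and any name that was confused only between the two pooled types stops being an error, so a merged category need not score as a simple average of its parts.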

Similar resources

Proper name knowledge acquisition for text understanding

Current work in proper name analysis is focused on identification and limited categorisation of names. Some research has been carried out in acquiring knowledge of proper names from the contextual information within texts. In this study, we investigate how to transform human-oriented compilations, which contain a rich knowledge of proper names, into formally represented knowledge for computer co...

Semantic Classification of Chinese Unknown Words

This paper describes a classifier that assigns semantic thesaurus categories to unknown Chinese words (words not already in the CiLin thesaurus and the Chinese Electronic Dictionary, but in the Sinica Corpus). The focus of the paper differs in two ways from previous research in this particular area. Prior research in Chinese unknown words mostly focused on proper nouns (Lee 1993, Lee, Lee and C...

Named Entity Recognition in Assamese

Named Entity Recognition is a process through which a program extracts proper nouns in texts and associates them with a proper tag. NER has made significant progress in European languages, but in Indian languages due to the lack of effort as well as proper resources, it remains a challenging task. Recognizing ambiguities and assigning the correct tags to the names is the main goal of NER. Thus ...

Categorization And Standardizing Proper Nouns For Efficient Information Retrieval

In this paper, we describe the most recent implementation and evaluation of the proper noun categorization and standardization module of the DRLINK document detection system being developed at Syracuse University, under the auspices of ARPA's TIPSTER program. We also discuss the expansion of group common nouns and group proper nouns to enhance retrieval recall. Successful proper noun boundary i...

Translation Quality Assessment of English Equivalents of Persian Proper Nouns: A case of bilingual tourist signposts in Isfahan

This study evaluated the translation quality of English equivalents of Persian proper nouns on the tourist signs and bilingual boards in Isfahan. To find different errors in the translations on the bilingual boards and tourist signs, the data were collected directly by photographing or transcribing the available tourist signs and bilingual boards. Then, the errors were assesse...

Publication date: 1996